params <- list(
n_age_init = 55,
n_age_max = 65,
d_c = 0.03,
d_e = 0.03,
n_sim = 1000,
r_HS = 0.1,
hr_SD = 1.9, # hazard ratio on rate
# overall death
# p_HD is the time-varying background mortality
# p_SD is the time-varying background mortality * 1.9(Hazard Ratio)
hr_HS_trt = 0.5,
u_H = 1,
u_S = 0.773,
u_D = 0,
c_H = 8494, # Non-CVD health care cost
c_trt = 840,
c_S = 3917 + 8494, # Cost of annual follow-up post ASCVD
c_nonfatal_firstYr = 49348,
c_fatal_firstYr = 16760,
c_D = 0,
cycle_length = 1,
n_cycles = 10
)
v_names_states = c("Healthy", "Sick", "Death")
n_states = length(v_names_states)
v_names_str = c("quo","trt")
n_strategies = length(v_names_str)SMDM Europe Worked Examples
Questions and To-Do
[Shawn] In our appendix example, we have a two-cycle tunnel state (T1 and T2) once you become ill. We allow for exits to background mortality from T1, but everyone in T2 exits to the receiving bucket “N”. I assume we don’t need exits to background mortality because we’re only using these for accounting purposes – but if it’s for a transient utility decrement, don’t we only need to apply the utility decrement to the non-decedents within T2?
[Shawn] Interestingly, in the HIV model, when I add in background mortality for Germany (based on modeled life table data), the resulting transition probability matrix (for an initial cohort of 40 year olds) results in a negative probability for transitions to secular death. I had to add in a parameter which is a hazard ratio for the HIV population (essentially multiplying the background mortality rate by 2) to get positive probabilities. But could use this as an example to show that embedding from the solved generator matrix doesn’t guarantee non-zero probabilities?
[Shawn] Need the optimization algorithm to solve for PSA parameters that minimizes:
(F(x_1) - p_1)^2 + (F(x_2)-p_2)^2
[Shawn]. Can we also add analytic solution code for lognormal and logistic? Or just rely on the optimization?
[All] Do we need a full-fledged CEA for the CVD model? Or would just creating the natural disease model suffice?
[Astrid] The life table data at the global burden of disease website is split into different files for different age buckets. Can we download these files and aggretate them together to get a full life table for each country/region/year combo?
[Astrid] Perhaps we could create a Shiny app that does the above?
- Input: country/region and year
- Output 1: Constructed life table
- Output 2: Parameters for mortality model
0. Setup
We begin by specifying a parameter object that will ultimately guide the discrete time model.
1. Mortality Modeling (Alive-Dead)
Estimate a mortality model to parameterize background mortality.
Background mortality will be based on a mortality model. This reduces the parameterization from the total number of age bins in the life table data to only a few parameters that can be used to characterize background mortality.
First we download and store the vital statistics data for the US from the human mortality database:
Next we construct a life table from these data. This is done using the demography package.
#############################################
# Define Parameters for Life Table Modeling
#############################################
mort_year = # Year to obtain from Human Mortality Database
2019
N = # Population basis for life table
100000
max_age = # Max age in life table
99
min_age = # Min age in life table
0
n_cycles = # Number of discrete time (annual) cycles in Markov
100
###############################
# Underlying Life-Table Data
###############################
lt <- # Read in the U.S. life table data from the human mortality database.
read_rds(here("_sandbox/mortality/usa-life-table.rds")) %>%
lifetable(.,series = "total", years = mort_year) %>%
as_tibble() %>%
mutate_at(vars(lx,dx), function(x) x*N) %>%
mutate(country = "USA") %>%
mutate(age = x)
lt %>%
ungroup() %>%
select(-x) %>%
select(country,age,everything()) %>%
head() %>%
kable() %>%
kable_styling()| country | age | year | mx | qx | lx | dx | Lx | Tx | ex |
|---|---|---|---|---|---|---|---|---|---|
| USA | 0 | 2019 | 0.00556 | 0.00553 | 100000 | 553.221 | 0.99482 | 78.963 | 78.963 |
| USA | 1 | 2019 | 0.00038 | 0.00038 | 99447 | 38.180 | 0.99428 | 77.968 | 78.402 |
| USA | 2 | 2019 | 0.00023 | 0.00023 | 99409 | 23.160 | 0.99397 | 76.974 | 77.432 |
| USA | 3 | 2019 | 0.00018 | 0.00018 | 99385 | 17.689 | 0.99377 | 75.980 | 76.450 |
| USA | 4 | 2019 | 0.00014 | 0.00014 | 99368 | 14.209 | 0.99361 | 74.986 | 75.463 |
| USA | 5 | 2019 | 0.00013 | 0.00013 | 99354 | 13.213 | 0.99347 | 73.993 | 74.474 |
The columns here are
age: Ages for lifetableyear: Period years or cohort yearsmx: Death rate at age x.qx: The probability that an individual of exact age x will die before exact age x+1.lx: Number of survivors to exact age x. This is defined relative to a radix, or the size of a cohort from which the life table is derived).dx: The number of deaths between exact ages x and x+1.Lx: Number of years lived between exact age x and exact age x+1.Tx: Number of years lived after exact age x.ex: Remaining life expectancy at exact age x.
Our next step is to fit a mortality model to these data. Generally speaking, we need three inputs:
age: Ages for lifetabledx: The number of deaths between exact ages x and x+1.lx: Number of survivors to exact age x. This is defined relative to a radix, or the size of a cohort from which the life table is derived).
The MortalityLaws package has a number of mortality models we can draw from. Table 1 summarizes these.
laws <- availableLaws()
laws$table %>%
arrange(TYPE) %>%
kable() %>%
kable_styling()| YEAR | NAME | MODEL | TYPE | CODE | FIT | SCALE_X |
|---|---|---|---|---|---|---|
| 1870 | Opperman | mu[x] = A/sqrt(x) - B + C*sqrt(x) | 1 | opperman | mu[x] | FALSE |
| 1939 | Weibull | mu[x] = 1/sigma * (x/M)^(M/sigma - 1) | 1 | weibull | mu[x] | FALSE |
| NA | Inverse-Gompertz | mu[x] = [1- exp(-(x-M)/sigma)] / [exp(-(x-M)/sigma) - 1] | 2 | invgompertz | mu[x] | TRUE |
| NA | Inverse-Weibull | mu[x] = 1/sigma * (x/M)^[-M/sigma - 1] / [exp((x/M)^(-M/sigma)) - 1] | 2 | invweibull | mu[x] | TRUE |
| 1825 | Gompertz | mu[x] = A exp[Bx] | 3 | gompertz | mu[x] | TRUE |
| NA | Gompertz | mu[x] = 1/sigma * exp[(x-M)/sigma)] | 3 | gompertz0 | mu[x] | TRUE |
| 1860 | Makeham | mu[x] = A exp[Bx] + C | 3 | makeham | mu[x] | TRUE |
| NA | Makeham | mu[x] = 1/sigma * exp[(x-M)/sigma)] + C | 3 | makeham0 | mu[x] | TRUE |
| 1932 | Perks | mu[x] = [A + BC^x] / [BC^-x + 1 + DC^x] | 3 | perks | mu[x] | TRUE |
| 1960 | Strehler-Mildvan | mu[x] = K * exp[-V0 * (1 - Bx)/D] | 3 | strehler_mildvan | mu[x] | TRUE |
| 1943 | Van der Maen | mu[x] = A + Bx + Cx^2 + I/[N - x] | 4 | vandermaen | mu[x] | TRUE |
| 1971 | Beard | mu[x] = A exp(Bx) / [1 + KA exp(Bx)] | 4 | beard | mu[x] | TRUE |
| 1971 | Beard-Makeham | mu[x] = A exp(Bx) / [1 + KA exp(Bx)] + C | 4 | beard_makeham | mu[x] | TRUE |
| 1979 | Gamma-Gompertz | mu[x] = A exp(Bx) / (1 + AG/B * [exp(Bx) - 1]) | 4 | ggompertz | mu[x] | TRUE |
| 1943 | Van der Maen | mu[x] = A + Bx + I/[N - x] | 5 | vandermaen2 | mu[x] | TRUE |
| NA | Quadratic | mu[x] = A + Bx + Cx^2 | 5 | quadratic | mu[x] | TRUE |
| 1998 | Kannisto | mu[x] = A exp(Bx) / [1 + A exp(Bx)] | 5 | kannisto | mu[x] | TRUE |
| 1998 | Kannisto-Makeham | mu[x] = A exp(Bx) / [1 + A exp(Bx)] + C | 5 | kannisto_makeham | mu[x] | TRUE |
| 1871 | Thiele | mu[x] = A exp(-Bx) + C exp[-.5D (x-E)^2] + F exp(Gx) | 6 | thiele | mu[x] | FALSE |
| 1883 | Wittstein | q[x] = (1/B) A^-[(Bx)^N] + A^-[(M-x)^N] | 6 | wittstein | q[x] | FALSE |
| 1979 | Siler | mu[x] = A exp(-Bx) + C + D exp(Ex) | 6 | siler | mu[x] | FALSE |
| 1980 | Heligman-Pollard | q[x]/p[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + G H^x | 6 | HP | q[x] | FALSE |
| 1980 | Heligman-Pollard | q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^x / [1 + GH^x] | 6 | HP2 | q[x] | FALSE |
| 1980 | Heligman-Pollard | q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^x / [1 + KGH^x] | 6 | HP3 | q[x] | FALSE |
| 1980 | Heligman-Pollard | q[x] = A^[(x + B)^C] + D exp[-E log(x/F)^2] + GH^(x^K) / [1 + GH^(x^K)] | 6 | HP4 | q[x] | FALSE |
| 1983 | Rogers-Planck | q[x] = A0 + A1 exp[-Ax] + A2 exp[B(x - u) - exp(-C(x - u))] + A3 exp[Dx] | 6 | rogersplanck | q[x] | FALSE |
| 1987 | Martinelle | mu[x] = [A exp(Bx) + C] / [1 + D exp(Bx)] + K exp(Bx) | 6 | martinelle | mu[x] | FALSE |
| 1992 | Carriere | l[x] = P1 l[x](weibull) + P2 l[x](invweibull) + P3 l[x](gompertz) | 6 | carriere1 | q[x] | TRUE |
| 1992 | Carriere | l[x] = P1 l[x](weibull) + P2 l[x](invgompertz) + P3 l[x](gompertz) | 6 | carriere2 | q[x] | TRUE |
| 1992 | Kostaki | q[x]/p[x] = A^[(x+B)^C] + D exp[-(E_i log(x/F_))^2] + G H^x | 6 | kostaki | q[x] | FALSE |
Because we want to stay general (i.e., model all over the age spectrum), our first attempt will be a Gompertz model.
Gompertz doesn’t fit all age ranges well—particularly young ages. This is a well-known fact. If we were to focus only on adults, however, this would be a nice way to go.
Here, for example, is a Gompertz model fit to 40 year old adults:
The nice thing about this is that it summarizes age-specific mortality in terms of just two parameters. We can then feed these parameters into the gompertz() formula to yield a death rate for a given age. Here is the mortality rate at age 75 for an initial population of 40 year olds:
MortalityLaws::gompertz(x = 75 - 40, coef(gom_fit40))$hx
[1] 0.030479
$par
A B
0.0014654 0.0867125
$Sx
[1] 0.71563
For this exercise we want to stay general, so let’s select an alternative model that can better fit the entire age range. To get a sense of the options, let’s select the models that are designed for the entire age range and see their fit:
Here we see that some do better than others. We’ll choose the HP2 (Heligman-Bollard) to fit, though note that any well-fitting mortality model can be used because there is a mapping of the fitted parameters to the mortality rate for each (see Table 1). The MortalityLaws package also has a function for each, much like the gompertz() function above. For our selected model, the function is HP2().
Let’s add the fitted mortality model parameters to our overall parameter list object. We’ll include gompertz , too, just for comparison later.
params <- modifyList(params,list(heligman_bollard = coef(hp_fit),
gompertz = coef(gom_fit)))Verify that a discrete time Markov model can replicate life-expectancy at age X.
Now we’ll answer the question: can we replicate life expectancy based on a life table using a discrete time Markov model?
To do this, we’ll consruct a very simple two-state discrete time Markov model. We’ll then parameterize the model with the coefficients from the mortality models above, and then calculate life expectancy with a QALY “payoff” of 1.0 if alive and no discounting.
Let’s start with a cohort of 40 year olds. Based on the life table, we should expect these individuals to live an additional 41 years
starting_age = 40
tr_ <- tr_alt_ <- # Start the Markov trace
bind_rows(c("alive" = 1, "dead" = 0)) %>% mutate(t = 0) %>% select(t,alive,dead)
for (.x in 1:n_cycles) {
r_death <-
HP2(.x + starting_age - 1, params$heligman_bollard)$hx
## Embedded Transition Probability Matrix
m_Q <- # construct the transtiion rate matrix (2x2 for alive-dead)
matrix(
c(-(r_death), r_death, 0, 0),
byrow = TRUE,
ncol = 2,
dimnames = list(c("alive", "dead"), c("alive", "dead"))
)
m_P <- # embed the transition probability matrix
expm(m_Q)
p_death <- 1 - exp(-r_death)
## Alternative version using standard rate-to-probability conversion formulas
m_P_ <-
matrix(c((1 - p_death),p_death,0,1), byrow=TRUE,
ncol=2,
dimnames = list(c("alive","dead"),c("alive","dead")))
tmp_ <- # current state occupancy
c(tr_[.x, "alive"], tr_[.x, "dead"]) %>% unlist()
tr_ <- # add next row of the Markov trace
bind_rows(
tr_,
(tmp_ %*% m_P) %>%
as.matrix() %>%
data.frame() %>%
mutate(t = .x) %>%
dplyr::select(t, alive, dead)
)
tmp_ <- # current state occupancy
c(tr_alt_[.x, "alive"], tr_alt_[.x, "dead"]) %>% unlist()
tr_alt_ <- # add next row of the Markov trace
bind_rows(
tr_alt_,
(tmp_ %*% m_P_) %>%
as.matrix() %>%
data.frame() %>%
mutate(t = .x) %>%
dplyr::select(t, alive, dead)
)
}
# Cycle adjustment
cycle_adj <-
c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3
# Life-Expectancy via the discrete time Markov
payoff_lifeexp <- c(1,0)
life_exp_markov <-
t(cycle_adj) %*% tr_$alive
life_exp_markov_alt <-
t(cycle_adj) %*% tr_alt_$alive
# Life-Expectancy via the life table
life_exp_lifetable <-
lt %>% filter(age==starting_age) %>% pull(ex)
tibble(markov = as.vector(life_exp_markov),
markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>%
kable(digits = 2) %>%
kable_styling()| markov | markov_alt | life_table |
|---|---|---|
| 40.73 | 40.73 | 41 |
We’ll now loop through various starting ages to verify that we can replicate life expectancy with a simple discrete time Markov model.
Figure 2 Shows that modeled life expectancy based on the Helligman-Bollard approach successfully replicates average life expectancy for nearly any age cohort. The figure also shows that the Gompertz model performs well among middle-aged adult cohorts (e.g., 40) but not for younger or older age cohorts.
2. Incorporating a Disease State (Healthy-Sick-Dead)
Split “Alive” health state into healthy vs. sick
Let’s now expand our simple model to include a “sick” state. For now we’ll assume you’re either healthy or sick, but that being sick does not confer a higher mortality rate:
Verify that a discrete time Markov model can replicate life-expectancy at age X.
Starting with a cohort aged 40, can we successfully replicate life expectancy again?
starting_age = .x = 40
tr_ <- tr_alt_ <-
bind_rows(c("healthy" = 1, "sick" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,dead)
for (.x in 1:n_cycles) {
r_death <- HP2(.x + starting_age - 1, params$heligman_bollard)$hx
# Embedded Version
m_Q <-
matrix(
c(-(r_death + params$r_HS), params$r_HS, r_death,
0, -(r_death), r_death,
0,0,0),
byrow = TRUE,
ncol = 3,
dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
)
m_P <-
expm(m_Q)
# Using Formulas to Convert
p_death <- 1 - exp(-r_death)
p_sick <- 1 - exp(-params$r_HS)
m_P_ <-
matrix(c(
(1-p_death - p_sick), p_sick, p_death,
0,(1-p_death),p_death,
0,0,1
),
byrow=TRUE,
ncol = 3,
dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
)
tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "dead"]) %>% unlist()
tr_ <-
bind_rows(
tr_,
(tmp_ %*% m_P) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
)
tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "dead"]) %>% unlist()
tr_alt_ <-
bind_rows(
tr_alt_,
(tmp_ %*% m_P_) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
)
}
# Cycle adjustment
cycle_adj <-
c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3
# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","dead")]) %*% payoff_living
life_exp_markov <-
t(cycle_adj) %*% living
life_exp_markov_alt <-
t(cycle_adj) %*% living_alt
# Life-Expectancy via the life table
life_exp_lifetable <-
lt %>% filter(age==starting_age) %>% pull(ex)
tibble(markov = as.vector(life_exp_markov),
markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>%
kable(digits = 2) %>%
kable_styling()| markov | markov_alt | life_table |
|---|---|---|
| 40.73 | 40.73 | 41 |
3. Cause-Specific Death
Show that if we assume a static hazard ratio for death (applied to the secular death rate) we will dramatically understate overall life expectancy.
What happens if we just put in a cause-specific hazard ratio that acts on the background mortality rate?
starting_age = .x = 40
tr_ <- tr_alt_ <-
bind_rows(c("healthy" = 1, "sick" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,dead)
for (.x in 1:n_cycles) {
r_death <- HP2(.x + starting_age - 1, params$heligman_bollard)$hx
# Correct Version
m_Q <-
matrix(
c(-(r_death + params$r_HS), params$r_HS, r_death,
0, -(r_death * params$hr_S), r_death* params$hr_S,
0,0,0),
byrow = TRUE,
ncol = 3,
dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
)
m_P <-
expm(m_Q)
# Using Formulas to Convert
p_death <- 1 - exp(-r_death)
p_sick <- 1 - exp(-params$r_HS)
p_sick_death <- 1 - exp(-(r_death * params$hr_S))
m_P_ <-
matrix(c(
(1-p_death - p_sick), p_sick, p_death,
0,(1-p_sick_death),p_sick_death,
0,0,1
),
byrow=TRUE,
ncol = 3,
dimnames = list(c("healthy", "sick","dead"), c("healthy", "sick","dead"))
)
tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "dead"]) %>% unlist()
tr_ <-
bind_rows(
tr_,
(tmp_ %*% m_P) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
)
tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "dead"]) %>% unlist()
tr_alt_ <-
bind_rows(
tr_alt_,
(tmp_ %*% m_P_) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,dead)
)
}
# Cycle adjustment
cycle_adj <-
c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3
# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","dead")]) %*% payoff_living
life_exp_markov <-
t(cycle_adj) %*% living
life_exp_markov_alt <-
t(cycle_adj) %*% living_alt
# Life-Expectancy via the life table
life_exp_lifetable <-
lt %>% filter(age==starting_age) %>% pull(ex)
tibble(markov = as.vector(life_exp_markov),
markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>%
kable(digits = 2) %>%
kable_styling()| markov | markov_alt | life_table |
|---|---|---|
| 34.88 | 34.92 | 41 |
We no longer approximate the ‘natural history’ life-expectancy! Uh oh.
Construct a cause-deleted life table, and use the cause deleted secular deaths instead.
Let’s get an age-specific cause of death data from the Global Burden of Disease project.
These data summarize, by age group, the percentage of overall deaths that are attributable to cardiovascular disease.
ihme_cvd <-
tibble::tribble(
~age_name, ~val,
1L, 0.038771524,
5L, 0.038546046,
10L, 0.044403585,
15L, 0.033781126,
20L, 0.035856165,
25L, 0.053077797,
30L, 0.086001439,
35L, 0.130326551,
40L, 0.184310334,
45L, 0.21839762,
50L, 0.243705394,
55L, 0.256334637,
60L, 0.26828001,
65L, 0.272698709,
70L, 0.28529754,
75L, 0.310642009,
0L, 0.016750489,
80L, 0.353518012,
85L, 0.399856716,
90L, 0.447817792,
95L, 0.495305502
) %>%
mutate(age_ihme = cut(age_name,unique(c(0,1,seq(0,95,5),105)),right=FALSE)) %>%
select(age_ihme, pct_cvd = val) We now need to take our life table data and bin it similarly to the IHME data.
# Source: https://grodri.github.io/demography/neoplasms
# The time to death wtihin intervals is drawn from the "a" column here.
# We really only use it for the 0-1 year old age range.
# url = "https://grodri.github.io/datasets/preston41.dat"
# b41 <- read.table(url, header=FALSE)
# names(b41) <- c("age","D","Di","lx","a")
edit.na <- function(x, value) { x[is.na(x)] <- value; x}
lt_ <-
lt %>%
mutate(age_ihme = cut(age,unique(c(0,1,seq(0,95,5),105)),right=FALSE)) %>%
left_join(ihme_cvd,"age_ihme") %>%
mutate(dx_i = round(dx * pct_cvd)) %>%
select(age_ihme,
age,
D = dx, # Deaths
Di = dx_i, # Cause-specific deaths
lx = lx) %>% # Living
mutate(a = ifelse(age_ihme == "[0,1)", 0.152, 0.5)) %>%
# The conditional probability of dying of a given cause given survival to
# the age group is easy to obtain, we just multiply the overall probability
# by the ratio of deaths of a given cause to all deaths.
mutate(q = edit.na(1 - lead(lx)/lx, 1),
qi = q * Di/D) %>%
# The unconditional counts of deaths of any cause and of a given cause
# are calculated multiplying by the number surviving to the start of each
# age group, which is lx. Recall that to die of cause i in the interval
# [x, x+n) one must survive all causes up to age x.
mutate(d = lx * q,
di = lx * qi) %>%
# In preparation for the next part, note that if we had nmx and we were willing
# to assume that the hazard is constant in each age group we would have had a
# slightly different estimate of the survival function. Let us “back out”
# the rates from the probabilities:
mutate(n = c(diff(age),NA),
m = edit.na( q/(n - q * (n - a)), 1/tail(a,1))) %>% # m[last] = 1/a[last]
# With these rates we compute the cumulative hazard and survival as
mutate(H = cumsum(n * m),
S = edit.na(exp(-lag(H)), 1)) %>% # S[1] = 1
# We compute cause-specific rates by dividing deaths of a given cause into person-years
# of exposure, which is equivalent to multiplying the overall rate by the ratio of
# deaths of a given cause to the total. Here we want deaths for causes other than
# neoplasms. I will use the subscript d for deleted:
mutate(Rd = (D - Di)/D,
md = m * Rd) %>%
# We compute the conditional probability of surviving an age group after
# deleting a cause as the overall probability raised to Rd, and then calculate
# the survival function as a cumulative product
mutate(pd = (1 - q)^Rd,
ld = 100000 * cumprod(c(1, pd[-length(pd)]))) %>%
# Then we construct a survival function in the usual way, but treating this
# hazard as if it was the only one operating:
mutate(Hd = cumsum(n * md),
Sd = edit.na(exp(-lag(Hd)), 1)) %>% # Sd[1] = 1
mutate(Pd = edit.na((Sd - lead(Sd))/md, tail(Sd/md, 1))) %>%
# Now do it for cause-specific death.
mutate(Ri = Di / D,
pi = (1 - qi)^Ri,
li = 100000 * cumprod(c(1, pi[-length(pi)]))) %>%
mutate(mi = m - md)
bx <- mutate(lt_, agem = age + n/2, mi = m - md)[-nrow(lt_), ]
p_cd <-
bx %>%
select(agem,m,md,mi) %>%
gather(series,value,-agem) %>%
mutate(series = factor(series,levels = c("m","md","mi"),labels = c("All Cause", "Non-CVD","CVD"))) %>%
ggplot(aes(x = agem, y = value, colour = series)) + geom_line() + scale_y_log10() +
ggsci::scale_color_aaas() +
geom_dl(method = "smart.grid",aes(label=series)) +
scale_x_continuous(expand = c(0.5,0)) +
theme(legend.position = "none") +
labs(x = "Age", y = "Mortality Rate") +
geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = m), alpha =0.2, colour = "darkblue") +
geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = md), alpha =0.2, colour = "red") +
geom_point(data = lt_ %>% mutate(series = "Life Table") %>% filter(age<100), aes(x = age, y = m - md), alpha =0.2, colour = "darkgreen")
p_cdages_ <- lt_$age[lt_$age<=max_age & lt_$age>=min_age]
deaths_ <- lt_$d[lt_$age<=max_age & lt_$age>=min_age] - lt_$di[lt_$age<=max_age & lt_$age>=min_age]
exposure_ <- lt_$lx[lt_$age<=max_age & lt_$age>=min_age]
hp_fit_ <- MortalityLaw(
x = ages_,
Dx = deaths_, # vector with death counts
Ex = exposure_, # vector containing exposures
law = "HP2",
opt.method = "LF2")Model using cause-specific death rate from the cause-deleted life table data.
Now let’s construct a model that adds in cause-specific death, and uses cause-deleted (modeled) mortality for secular death:
params <- modifyList(params, list(cause_deleted = coef(hp_fit_)))
starting_age = .x = 40
tr_ <- tr_alt_ <-
bind_rows(c("healthy" = 1, "sick" = 0, "cvddeath" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead)
for (.x in 1:n_cycles) {
current_age <- min(.x + starting_age - 1,max(lt_$age))
r_death <- HP2(.x + starting_age - 1, params$cause_deleted)$hx
r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
# Correct Version
m_Q <-
matrix(
c(-(r_death + params$r_HS), params$r_HS, 0, r_death,
0, -(r_cause+r_death), r_cause,r_death,
0,0,0,0,
0,0,0,0),
byrow = TRUE,
ncol = 4,
dimnames = list( c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
)
m_P <-
expm(m_Q)
# Using Formulas to Convert
p_death <- 1 - exp(-r_death)
p_sick <- 1 - exp(-params$r_HS)
p_sick_death <- 1 - exp(-(r_cause))
m_P_ <-
matrix(c(
(1-p_death - p_sick), p_sick, 0,p_death,
0,(1-p_sick_death - p_death),p_sick_death,p_death,
0,0,0,0,
0,0,0,1
),
byrow=TRUE,
ncol = 4,
dimnames = list(c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
)
tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "cvddeath"],tr_[.x, "dead"]) %>% unlist()
tr_ <-
bind_rows(
tr_,
(tmp_ %*% m_P) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
)
tmp_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "cvddeath"], tr_alt_[.x, "dead"]) %>% unlist()
tr_alt_ <-
bind_rows(
tr_alt_,
(tmp_ %*% m_P_) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
)
}
# Cycle adjustment
cycle_adj <-
c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3
# Life-Expectancy via the discrete time Markov
payoff_living <- matrix(c(1,1,0,0),byrow=TRUE,ncol=1)
living <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_living
living_alt <- as.matrix(tr_alt_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_living
life_exp_markov <-
t(cycle_adj) %*% living
life_exp_markov_alt <-
t(cycle_adj) %*% living_alt
# Life-Expectancy via the life table
life_exp_lifetable <-
lt %>% filter(age==starting_age) %>% pull(ex)
tibble(markov = as.vector(life_exp_markov),
markov_alt = as.vector(life_exp_markov_alt),life_table = life_exp_lifetable) %>%
kable(digits = 2) %>%
kable_styling()| markov | markov_alt | life_table |
|---|---|---|
| 40.97 | 40.88 | 41 |
lt_ %>%
filter(age>=40) %>%
mutate(qd = 1 - exp(-md)) %>%
select(-pd) %>%
mutate(pd = 1 - qd) %>%
mutate(cump = ifelse(row_number()==1,1,lag(cumprod(pd),1)) ) %>%
select(-lx) %>%
mutate(lx = N * cump) %>%
mutate(ditest = lx * qd) %>%
filter(age<=85) %>%
ggplot(aes(x = age, y = cump)) + geom_step() +
geom_line(data = tr_ %>% filter(t<=45), aes(x = starting_age + t, y = 1 - dead ), colour = "red") + labs(x = "Age", y = "Survival")Good news: we essentially replicate overall life expectancy! Though note the small difference with the model based on formula conversions rather than embedding.
- Note that we used the cause-deleted life table’s cause-specific mortality rate (
mi) here. Another alternative would be to use a similar mortality model as above – but in practice, for CVD deaths this does not yield good predictions for old ages (see plot )
ages_ <- lt_$age[lt_$age<=max_age & lt_$age>=min_age]
deaths_i <- lt_$di[lt_$age<=max_age & lt_$age>=min_age]
exposure_i <- lt_$li[lt_$age<=max_age & lt_$age>=min_age]
hp_fit_i <- MortalityLaw(
x = ages_,
Dx = deaths_i, # vector with death counts
Ex = exposure_i, # vector containing exposures
law = "HP2",
opt.method = "LF2")
params <- modifyList(params, list(cause_deleted = coef(hp_fit_),
cause_specific = coef(hp_fit_i)))
plot(hp_fit_i)4. Ensuring Accurate Counts, Costs and QALYs
Count the number who become ill under various approaches.
Can also try Fernando’s approach
starting_age = .x = 40
starting_pop = N
tr_ <-
bind_rows(c("healthy" = N, "sick" = 0, "cvddeath" = 0, "dead" = 0, "accsick" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead,accsick)
tr_alt_ <-
bind_rows(c("healthy" = N, "sick" = 0, "cvddeath" = 0, "dead" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick,cvddeath,dead)
tr_nodeath <-
bind_rows(c("healthy" = N, "sick" = 0)) %>% mutate(t = 0) %>% select(t,healthy,sick)
# transition array
arr_ <- arr_alt_ <- arr_nodeath <- list()
for (.x in 1:n_cycles) {
current_age <- min(.x + starting_age - 1,max(lt_$age))
r_death <- HP2(.x + starting_age - 1, params$cause_deleted)$hx
r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
m_Q_nodeath <-
matrix(
c(-params$r_HS,params$r_HS,
0,0),
byrow=TRUE,
ncol = 2,
dimnames = list(c("healthy","sick"),c("healthy","sick"))
)
m_P_nodeath <-
expm(m_Q_nodeath)
tmp_nodeath <- c(tr_nodeath[.x, "healthy"], tr_nodeath[.x, "sick"]) %>% unlist()
tr_nodeath <-
bind_rows(
tr_nodeath,
(tmp_nodeath %*% m_P_nodeath) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick)
)
# Correct Version
m_Q <-
matrix(
c(-(r_death + params$r_HS), params$r_HS, 0, r_death,
0, -(r_cause+r_death), r_cause,r_death,
0,0,0,0,
0,0,0,0),
byrow = TRUE,
ncol = 4,
dimnames = list( c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
)
# Add the accumulator
m_Q_acc <-
cbind(rbind(m_Q,rep(0,nrow(m_Q))),rep(0,ncol(m_Q)+1))
rownames(m_Q_acc) = c(rownames(m_Q),"accsick")
colnames(m_Q_acc) = c(colnames(m_Q),"accsick")
m_Q_acc["healthy","accsick"] <- m_Q_acc["healthy","sick"]
m_P <-
expm(m_Q_acc)
# Using Formulas to Convert
p_death <- 1 - exp(-r_death)
p_sick <- 1 - exp(-params$r_HS)
p_sick_death <- 1 - exp(-(r_cause))
m_P_ <-
matrix(c(
(1-p_death - p_sick), p_sick, 0,p_death,
0,(1-p_sick_death - p_death),p_sick_death,p_death,
0,0,0,0,
0,0,0,1
),
byrow=TRUE,
ncol = 4,
dimnames = list(c("healthy", "sick","cvddeath","dead"), c("healthy", "sick","cvddeath","dead"))
)
tmp_ <- c(tr_[.x, "healthy"], tr_[.x, "sick"], tr_[.x, "cvddeath"],tr_[.x, "dead"],tr_[.x,"accsick"]) %>% unlist()
tr_ <-
bind_rows(
tr_,
(tmp_ %*% m_P) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead,accsick)
)
arr_tmp <- diag(tmp_)
arr_[[.x]] <-
arr_tmp %*% m_P
tmp_alt_ <- c(tr_alt_[.x, "healthy"], tr_alt_[.x, "sick"], tr_alt_[.x, "cvddeath"], tr_alt_[.x, "dead"]) %>% unlist()
tr_alt_ <-
bind_rows(
tr_alt_,
(tmp_alt_ %*% m_P_) %>% as.matrix() %>% data.frame() %>% mutate(t = .x) %>% select(t,healthy,sick,cvddeath,dead)
)
arr_alt_tmp <- diag(tmp_alt_)
arr_alt_[[.x]] <-
arr_alt_tmp %*% m_P_
}
# Cycle adjustment
cycle_adj <-
c(rep(c(4/3,2/3),n_cycles/2),1/3)
cycle_adj[1] <- 1/3
# Embedded
payoff_sick <- matrix(c(0,0,0,0,1))
sick <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead","accsick")]) %*% payoff_sick
sick_embedded <- t(cycle_adj) %*% sick
# Formulas
payoff_sick_alt <- matrix(c(0,1,0,0))
sick_alt <- as.matrix(tr_[,c("healthy","sick","cvddeath","dead")]) %*% payoff_sick_alt
sick_formula <- t(cycle_adj) %*% sick_alt
# No Death
payoff_sick_nodeath <- matrix(c(0,1))
sick_nodeath <- as.matrix(tr_nodeath[,c("healthy","sick")]) %*% payoff_sick_nodeath
sick_nodeath <- t(cycle_adj) %*% sick_nodeath
sick_nodeath [,1]
[1,] 8965048
sick_formula [,1]
[1,] 3166013
sick_embedded [,1]
[1,] 8706612
Why the divergence? At later years the hazard of death is very high – but the hazard of dying from CVD is also very high!
Let’s take a look at what happens in the first cycle.
Here is the Markov trace for the embedded matrix:
tr_[2,]# A tibble: 1 × 6
t healthy sick cvddeath dead accsick
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 90336. 9499. 1.86 163. 9509.
And here it is for the matrix constructed using conversion formulas:
tr_alt_[2,]# A tibble: 1 × 5
t healthy sick cvddeath dead
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 90321. 9516. 0 163.
We see that in both, 163 people die. But there are more people who remain sick in the trace based on the formulas (9516) than in the trace based on the embedded matrix (9499). Why?
9516 is the total who would become sick if we don’t allow any death in the interval for those who would become sick in the interval.
That is, let’s think about the population who would become sick in an interval. There are four mutually exclusive groups:
- Individuals who become sick in the interval and stay sick throughout the interval.
- Individuals who would have become sick in the interval, but died first.
- Individuals who became sick in the interval, but also died from the illness within the interval.
- Individuals who became sick in the interval, but also died from background causes within the interval.
The total number 9516 sick at the end of the interval based on a matrix converted using standard formulas reflects the union of all the above. We’re essentially “trapping” these patients within the interval and not allowing them to die for any other reason–either before or after they become ill.
Adding Non-Markovian Elements to the Transition Probability Matrix
Accumulators
Tunnel states
We will create a T1 state for tracking tunnels and allow exit from due to external death. This requires us to have a recieving bucket for these external risks ‘N’ for NULL so as not to disturb the balance of the Markov.
m = number of primary health states
- Create m x m rate matrix R with initial cell values of 0
- Fill in off-diagonal elements per parameters
- Diagonal of R is negative row sum of off-diagonal elements
- Add accumulator columns and rows as 0s. Call this new matrix R_
- Fill in appropriate accumulator transition rates in R_
- Add in tunnel states as new columns and rows
- Add an additional “N” (NULL) state as recieving bucket external risks (e.g., background mortality) that can occur while in the tunnel so as not to disturb the balance of the Markov.
- Take matrix exponential of R_ to obtain M_, the initial embedded transition probability matrix
- Make edits to R_ to reflect tunnel state activity.
- Since it is possible to exit the tunnel for background mortality reasons, add the total background mortality
starting_age = 40
markov_rate_matrices<- function(t)
{
lapply(t, function(tt){
current_age <- min(tt + starting_age - 1,max(lt_$age))
r_death <- HP2(current_age, params$cause_deleted)$hx
r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
x <- matrix(c(0.0, params$r_HS, 0.0, r_death,
0.0, 0.0, r_cause, r_death,
0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0),
nrow=4,
ncol=4,
byrow=TRUE,
dimnames=list(c("healthy", "sick","cvddeath","dead"),
c("healthy", "sick","cvddeath","dead")))
diag(x) <- -rowSums(x)
x
})
}
non_markov_rate_matrices <- function(t)
{
markov <- markov_rate_matrices(t)
non_markov <- list()
for(i in t)
{
# Expand matrix
non_markov[[i]] <- cbind(markov[[i]], matrix(rep(0, 4*3), nrow=4))
non_markov[[i]] <- rbind(non_markov[[i]], matrix(rep(0, 3*7), nrow=3))
}
lapply(non_markov, function(m){
# Put in State Names
rownames(m) <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1", "N")
colnames(m) <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1", "N")
# Define Accumulator
m["healthy", "accsick"] <- m["healthy", "sick"]
# Define Tunnel state entry
m["healthy", "sickT1"] <- m["healthy", "sick"]
m # Note: Tunnel states are not fully defined at this point.
})
}Next we embed the non-Markovian matrices into the time step and finish the definition of the tunnel states in the probability space.
transition_prob <- function(t,h=1)
{
# Embed the matrices into the timesteps
tp <- lapply(non_markov_rate_matrices(t), function(m) expm(m*h))
lapply(tp, function(m)
{
# It is possible to exit tunnel to external risk of death
m["sickT1", "N"] <- m["healthy", "cvddeath"] + m["healthy", "dead"]
m["sickT1", "sickT1"] <- 0 # Cannot remain in tunnel
# last state in the tunnel is a terminal state
m["sickT1", "N"] <- 1
# Note: At this point, the "N" state could be stripped as it was
# only required for the embedding, and serves no other purpose
# at this point
states <- c("healthy", "sick", "cvddeath", "dead", "accsick", "sickT1")
m[states, states]
})
}5. Backwards Conversion
m_P <- matrix(c(0.721, 0.202, 0.067, 0.010,
0.000, 0.581, 0.407, 0.012,
0.000, 0.000, 0.750, 0.250,
0.000, 0.000, 0.000, 1.000),
nrow=4, byrow=TRUE,
dimnames=list(c("A", "B", "C", "D"),
c("A", "B", "C", "D")))
m_P A B C D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000
The bottom diagonal was corrected to be a Markov absorbing state by having 1 on the diagonal. For the purposes of example, we’ll assume we wish to add an additional state “E”, which has a continuous rate of 0.2 that competes with other transitions.
Take the HIV model and use eigenvalue decomposition to obtain the continuous generator matrix.
To accomplish this we must find the continuous generator.
V <- eigen(m_P)$vectors
iV <- solve(V)
Ap <- iV %*% m_P %*% V
Ap [,1] [,2]
[1,] 1.000000000000000000000 -0.000000000000000011388
[2,] 0.000000000000000172422 0.750000000000000111022
[3,] -0.000000000000001054209 -0.000000000000000197747
[4,] -0.000000000000000062574 -0.000000000000000018793
[,3] [,4]
[1,] 0.0000000000000000000000 0.000000000000000016253
[2,] 0.0000000000000000000000 -0.000000000000000156542
[3,] 0.7209999999999999742428 0.000000000000000036544
[4,] -0.0000000000000000059637 0.580999999999999960920
Due to the numeric probabilities not being exactly correct, the off-diagonal elements of A' are not zero, but they are quite close. We will zero these off diagonal elements and assume that the non-zero elements are numerical error. Then continue by taking the log of the diagonal.
lAp <- diag(log(diag(Ap)), nrow(Ap), ncol(Ap))
R <- V %*% lAp %*% iV
R [,1] [,2] [,3] [,4]
[1,] -0.3271161416971880009363 0.3114960917676688478828 0.0024395 0.013181
[2,] 0.0000000000000000025585 -0.5430045221302257640872 0.6148890 -0.071884
[3,] 0.0000000000000000000000 0.0000000000000000070641 -0.2876821 0.287682
[4,] 0.0000000000000000000000 0.0000000000000000000000 0.0000000 0.000000
An now we have the continuous time rate generator for the Markov Model. There is still some numerical error—for example the bottom row, has values very near to zero, and the diagonal is not exactly the negative sum of the rest of the row. We can clean this up by tweaking the numbers numerical towards their constraints.
R[abs(R) < 1e-6 ] <- 0
rownames(R) <- c("A", "B", "C", "D")
colnames(R) <- c("A", "B", "C", "D")
round(R, 3) A B C D
A -0.327 0.311 0.002 0.013
B 0.000 -0.543 0.615 -0.072
C 0.000 0.000 -0.288 0.288
D 0.000 0.000 0.000 0.000
Some numerical error is inevitable in this process and cannot be avoided. However, even cleaning up the small and obvious errors, the transition from B -> D is impossible since it’s negative. Rates are relative to the occupancy of the source state, in this case B. Having a negative rate implies that B would have some transitions in which the dead come alive into B based upon the occupancy of B. Obviously, this is not possible. The problem now is how to adjust the model’s rates to be physically possible, while as faithful to the original data as possible. B is a sicker state, so it should have a higher death rate.
Demonstrate that the embedded (new) transition probability matrix successfully recreates the original transition probability matrix and results.
Let’s double check it recapitulates the original rates.
expm(R)4 x 4 Matrix of class "dgeMatrix"
A B C D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000
Now let’s replicate the monotherapy strategy results from the original.
params_hiv <-
list(
R = R,
dmca = 1701 , # Direct medical costs associated with state A
dmcb = 1774 , # Direct medical costs associated with state B
dmcc = 6948 , # Direct medical costs associated with state C
ccca = 1055 , # Community care costs associated with state A
cccb = 1278 , # Community care costs associated with state B
cccc = 2059 , # Community care costs associated with state C
cAZT = 2278 , # Zidovudine drug cost
cLam = 2087 , # Lamivudine drug cost
RR = 0.509, # Treatment effect (RR)
cDR = 0.06, # Annual discount rate - costs (%)
oDR = 0 # Annual discount rate - benefits (%)
)
starting_age = 40
markov_rate_matrices <- function(t)
{
lapply(t, function(tt) {
current_age <- min(tt + starting_age - 1, max(lt_$age))
# r_death <- HP2(current_age, params$cause_deleted)$hx
# r_cause <- lt_ %>% filter(age==current_age) %>% pull(mi)
x <- params_hiv$R
diag(x) <- 0
diag(x) <- -rowSums(x)
x
})
}
transition_prob <- function(t,h=1)
{
# Embed the matrices into the timesteps
tp <- lapply(markov_rate_matrices(t), function(m) expm(m*h))
}
Y <- t(c(A = 1, B = 0, C = 0, D = 0))
res_mono <- do.call(rbind, lapply(transition_prob(1:20), function(tp) {
Y <<- Y %*% tp
}))
res_mono <- rbind(
c(A = 1, B = 0, C = 0, D = 0),
res_mono
)
res_mono <- round(res_mono,3)
payoff_health = matrix(c(1,1,1,0),byrow=TRUE,ncol=1,dimnames=list(c("A","B","C","D"),c("qaly")))
qaly_mono <- as.matrix(res_mono[-1,]) %*% payoff_health
qaly_disc_mono <- qaly_mono / (1 + params_hiv$oDR)^c(1:(nrow(res_mono)-1))
payoff_cost_mono <- with(params_hiv,matrix(c(dmca + ccca + cAZT,
dmcb + cccb + cAZT,
dmcc + cccc + cAZT,
0)))
cost_mono <- as.matrix(res_mono[-1,]) %*% payoff_cost_mono %>% round(.,0)
cost_disc_mono <- cost_mono / (1 + params_hiv$cDR)^c(1:(nrow(res_mono)-1))
sum(qaly_disc_mono)[1] 7.974
sum(round(cost_disc_mono,0))[1] 44589
This compares favorably to the textbook answer, which is 7.99 for QALYs and £44,663 for costs. The main differences seem to be rounding in the Excel answer within the textbook.
6. Changing Countries or Contexts
Rather than have a static death rate, let’s swap in secular mortality from several different countries and regions.
tibble::tribble(
~code, ~country,
"AUS", "Australia",
"AUT", "Austria",
"BLR", "Belarus",
"BEL", "Belgium",
"BGR", "Bulgaria",
"CAN", "Canada",
"CHL", "Chile",
"CZE", "Czech Republic",
"DNK", "Denmark",
"EST", "Estonia",
"FIN", "Finland",
"FRATNP", "France (total population)",
"FRACNP", "France (civilian population)",
"DEUTNP", "Germany (total population)",
"DEUTE", "Germany (east)",
"DEUTW", "Germany (west)",
"GRC", "Greece",
"HUN", "Hungary",
"ISL", "Iceland",
"IRL", "Ireland",
"ISR", "Israel",
"ITA", "Italy",
"JPN", "Japan",
"LVA", "Latvia",
"LTU", "Lithuania",
"LUX", "Luxembourg",
"NLD", "Netherlands",
"NZL_NP", "New Zealand (total population)",
"NZL_MA", "New Zealand (Maori population)",
"NZL_NM", "New Zealand (non-Maori population)",
"NOR", "Norway",
"POL", "Poland",
"PRT", "Portugal",
"RUS", "Russia",
"SVK", "Slovakia",
"SVN", "Slovenia",
"ESP", "Spain",
"SWE", "Sweden",
"CHE", "Switzerland",
"TWN", "Taiwan",
"GBR_NP", "United Kingdom",
"GBRTENW", "England & Wales (total population)",
"GBRCENW", "England & Wales (civilian population)",
"GBR_SCO", "Scotland",
"GBR_NIR", "Northern Ireland",
"USA", "U.S.A.",
"UKR", "Ukraine"
) %>%
kable() %>%
kable_styling()| code | country |
|---|---|
| AUS | Australia |
| AUT | Austria |
| BLR | Belarus |
| BEL | Belgium |
| BGR | Bulgaria |
| CAN | Canada |
| CHL | Chile |
| CZE | Czech Republic |
| DNK | Denmark |
| EST | Estonia |
| FIN | Finland |
| FRATNP | France (total population) |
| FRACNP | France (civilian population) |
| DEUTNP | Germany (total population) |
| DEUTE | Germany (east) |
| DEUTW | Germany (west) |
| GRC | Greece |
| HUN | Hungary |
| ISL | Iceland |
| IRL | Ireland |
| ISR | Israel |
| ITA | Italy |
| JPN | Japan |
| LVA | Latvia |
| LTU | Lithuania |
| LUX | Luxembourg |
| NLD | Netherlands |
| NZL_NP | New Zealand (total population) |
| NZL_MA | New Zealand (Maori population) |
| NZL_NM | New Zealand (non-Maori population) |
| NOR | Norway |
| POL | Poland |
| PRT | Portugal |
| RUS | Russia |
| SVK | Slovakia |
| SVN | Slovenia |
| ESP | Spain |
| SWE | Sweden |
| CHE | Switzerland |
| TWN | Taiwan |
| GBR_NP | United Kingdom |
| GBRTENW | England & Wales (total population) |
| GBRCENW | England & Wales (civilian population) |
| GBR_SCO | Scotland |
| GBR_NIR | Northern Ireland |
| USA | U.S.A. |
| UKR | Ukraine |
#hmd.ger <- demography::hmd.mx("DEUTNP",username = "", password = "", "")
#write_rds(hmd.ger,file=here("_sandbox/mortality/germany-life-table.rds"))
#hmd.pol <- demography::hmd.mx("POL",username = "", password = "", "POL")
#write_rds(hmd.pol,file=here("_sandbox/mortality/poland-life-table.rds"))Fit a mortality model to German life-table data and use the mortality rate from this model to estimate QALYs and costs
# Fit the mortality model
min_age = 40
max_age = 99
hmd_ger <- read_rds(here("_sandbox/mortality/germany-life-table.rds"))
lt_ger <- # Read in the U.S. life table data from the human mortality database.
hmd_ger %>%
lifetable(.,series = "total", years = mort_year) %>%
as_tibble() %>%
mutate_at(vars(lx,dx), function(x) x*N) %>%
mutate(country = "USA") %>%
mutate(age = x)
ages <- lt_ger$x[lt_ger$x<=max_age & lt_ger$x>=min_age]
deaths <- lt_ger$dx[lt_ger$x<=max_age & lt_ger$x>=min_age]
exposure <- lt_ger$lx[lt_ger$x<=max_age & lt_ger$x>=min_age]
hp_fit_ger <- MortalityLaw(
x = ages,
Dx = deaths, # vector with death counts
Ex = exposure, # vector containing exposures
law = "HP2",
opt.method = "LF2")
params_hiv <- modifyList(params_hiv, list(mort = coef(hp_fit_ger),
hr_mort = 2))markov_rate_matrices <- function(t)
{
lapply(t, function(tt) {
current_age <- min(tt + starting_age - 1, max(lt_$age))
r_death <- params_hiv$hr_mort * HP2(current_age, params_hiv$mort)$hx
x <- params_hiv$R
x["A","D"] <- r_death
diag(x) <- 0
diag(x) <- -rowSums(x)
x
})
}
transition_prob <- function(t,h=1)
{
# Embed the matrices into the timesteps
tp <- lapply(markov_rate_matrices(t), function(m) expm(m*h))
tp
}
Y <- t(c(A = 1, B = 0, C = 0, D = 0))
res_mono <- do.call(rbind, lapply(transition_prob(1:20), function(tp) {
Y <<- Y %*% tp
}))
res_mono <- rbind(
c(A = 1, B = 0, C = 0, D = 0),
res_mono
)
res_mono <- round(res_mono,3)
payoff_health = matrix(c(1,1,1,0),byrow=TRUE,ncol=1,dimnames=list(c("A","B","C","D"),c("qaly")))
qaly_mono <- as.matrix(res_mono[-1,]) %*% payoff_health
qaly_disc_mono <- qaly_mono / (1 + params_hiv$oDR)^c(1:(nrow(res_mono)-1))
payoff_cost_mono <- with(params_hiv,matrix(c(dmca + ccca + cAZT,
dmcb + cccb + cAZT,
dmcc + cccc + cAZT,
0)))
cost_mono <- as.matrix(res_mono[-1,]) %*% payoff_cost_mono %>% round(.,0)
cost_disc_mono <- cost_mono / (1 + params_hiv$cDR)^c(1:(nrow(res_mono)-1))
sum(qaly_disc_mono)[1] 8.251
sum(round(cost_disc_mono,0))[1] 45928
7. Inpororating new evidence
Suppose a new combination therapy treatment has been approved. What is the cost effectiveness relative to monotherapy?
8. Solving for PSA parameters
The new combination therapy arm has some new parameters with 95% CIs.
Solve for the PSA distribution parameters
How Do We Draw PSA Values?
| Parameter Type | Distribution |
|---|---|
| Probability | beta |
| Rate | gamma |
| Utility weight | beta |
| Right skew (e.g., cost) | gamma, lognormal |
| Relative risks or hazard ratios | lognormal |
| Odds Ratio | logistic |
Normal Distribution
\sigma = \frac{x_2 - x_1}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \tag{1}
\mu = \frac{x_1\Phi^{-1}(p_2)-x_2\Phi^{-1}(p_1)}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)}
x1 <- qnorm(0.1,1.3,.2)
p1 <- 0.1
x2 <- qnorm(0.9,1.3,.2)
p2 <- 0.9
param_normal <- function(x1,x2,p1,p2) {
sigma = (x2 - x1) / (qnorm(p2,0,1)-qnorm(p1,0,1)); sigma
mu <- (x1*qnorm(p2,0,1)-x2*qnorm(p1,0,1))/(qnorm(p2,0,1)-qnorm(p1,0,1)); mu
c("mu" = mu, "sigma" = sigma)
}Gamma Distribution
x1 <- 0.6
x2 <- 0.8
p1 <- 0.1
p2 <- 0.9
gamma_fn <- function(alpha) {
x1*qgamma(p2,shape = alpha, scale =1) - x2 * qgamma(p1, shape = alpha, scale = 1)
}
calc_beta <- function(x1,p1,alpha) {
x1 / qgamma(p1,alpha,1)
}
curve(gamma_fn, xlim = c(1,100), col = "blue", lwd = 1.5, lty=2)
abline(a=0,b=0)alpha_ <- uniroot(gamma_fn,c(70,85))$root; alpha_[1] 79.748
beta_ <- calc_beta(x1 = x1, p1 = p1, alpha = alpha_); beta_[1] 0.0087542
# Check the answer
qgamma(0.1,shape = alpha_, scale = beta_)[1] 0.6
qgamma(0.9,shape = alpha_, scale = beta_)[1] 0.8
param_gamma <- function(x1,x2,p1,p2,range) {
alpha_ <- uniroot(gamma_fn,range)$root; alpha_
beta_ <- calc_beta(x1 = x1, p1 = p1, alpha = alpha_); beta_
c("alpha" = alpha_, "beta" = beta_)
}
param_gamma(x1= x1, x2 = x2, p1 = p1, p2 = p2, range = c(60,100)) alpha beta
79.7475577 0.0087542
Beta Distribution
# Source: https://stats.stackexchange.com/questions/112614/determining-beta-distribution-parameters-alpha-and-beta-from-two-arbitrary
x1 <- 0.6
x2 <- 0.8
p1 <- 0.1
p2 <- 0.9
# Logistic transformation of the Beta CDF.
f.beta <- function(alpha, beta, x, lower=0, upper=1) {
p <- pbeta((x-lower)/(upper-lower), alpha, beta)
log(p/(1-p))
}
# Sums of squares.
delta <- function(fit, actual) sum((fit-actual)^2)
# The objective function handles the transformed parameters `theta` and
# uses `f.beta` and `delta` to fit the values and measure their discrepancies.
objective <- function(theta, x, prob, ...) {
ab <- exp(theta) # Parameters are the *logs* of alpha and beta
fit <- f.beta(ab[1], ab[2], x, ...)
return (delta(fit, prob))
}
x.p <- (function(p) log(p/(1-p)))(c(p1, p2))
start <- log(c(1e1, 1e1))
sol <- nlm(objective, start, x=c(x1,x2), prob=x.p, lower=0, upper=1, typsize=c(1,1), fscale=1e-12, gradtol=1e-12)
params <- exp(sol$estimate); params[1] 23.707 10.031
qbeta(p = c(p1, p2), params[1], params[2])[1] 0.6 0.8
9. Simulating PSA values from copulas
######################################
## Option 1: Cholesky Decomposition
######################################
# Source: Coping with Copulas Sec 5.1
# 1. Define the correlation matrix
rho <- 0.6
sigma <- matrix(c(1,rho,rho,1),nrow = 2, byrow=TRUE)
# 2. Perform a cholesky decomposition
A = chol(sigma)
# 3. Generate iid standard normal pseudo random variables
tildeY0 <- rnorm(n = 1e6, mean = 0, sd = 1)
tildeY1 <- rnorm(n = 1e6, mean = 0, sd = 1)
tildeY <- cbind(tildeY0,tildeY1)
# Collect them
tildeY <- tildeY %*% A
# Use the standard normal disribution function to return the quantiles
U <- pnorm(tildeY)
# Use the inverse distribution function to return the values
Y <- U
Y[,1] <- qpois(U[,1],lambda = 10)
Y[,2] <- qpois(U[,2],lambda = 12)
# Confirm the correct correlation
cor(Y[,1],Y[,2])[1] 0.59356
##########################
## Option 2: Use MVRNORM
##########################
U_ <- pnorm(mvrnorm(n = 1e6, Sigma = sigma, mu = c(0,0)))
Y_ <- U_
Y_[,1] <- qpois(U_[,1],lambda = 10)
Y_[,2] <- qpois(U_[,2],lambda = 12)
cor(Y_[,1],Y_[,2])[1] 0.5921
Simulate direct medical care and community care costs for each state using copulas.
rho <- 0.6
sigma <- matrix(c(1,rho,rho,1),nrow = 2, byrow=TRUE)
U_ <- pnorm(mvrnorm(n = 1e3, Sigma = sigma, mu = c(0,0)))
Y_ <- U_
Y_[,1] <- qpois(U_[,1],lambda = 10)
Y_[,2] <- qpois(U_[,2],lambda = 12)
cor(Y_[,1],Y_[,2])[1] 0.60362